Building ancient Spanish dictionaries for spell-checking of DL texts

Authors

  • Alejandro Bia
  • Manuel Sánchez-Quero
Abstract

Aware of how useful spell-checkers are for correcting modern works, and lacking such a facility for ancient texts, we decided to build dictionaries for ancient Spanish. This decision raised new problems and new questions. To help solve the problem of spell-checking ancient Spanish, we have built a time-aware system of dictionaries that takes into account the temporal dynamics of language. In this paper we present the problems we found, the decisions we made, and the conclusions and results we arrived at.
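
The abstract does not describe how the time-aware dictionaries are organized, so the following is only a minimal sketch of the general idea, not the authors' system. It assumes that each accepted spelling carries a range of years in which it is considered valid, and that a text is checked against the year it was written. The names `Entry`, `TimeAwareDictionary`, `first_year`, and `last_year` are hypothetical, and the year ranges in `SAMPLE` are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One accepted spelling with a validity period (hypothetical model)."""
    word: str
    first_year: int  # assumption: earliest year the spelling is accepted
    last_year: int   # assumption: latest year the spelling is accepted


class TimeAwareDictionary:
    """Looks words up relative to the year a text was written."""

    def __init__(self, entries):
        # Group entries by lowercased word; a word may have several periods.
        self._by_word = {}
        for entry in entries:
            self._by_word.setdefault(entry.word.lower(), []).append(entry)

    def is_valid(self, word, year):
        """True if the spelling is accepted for the given year."""
        return any(
            e.first_year <= year <= e.last_year
            for e in self._by_word.get(word.lower(), [])
        )

    def misspelled(self, text, year):
        """Return whitespace-separated tokens not accepted for that year."""
        return [w for w in text.split() if not self.is_valid(w, year)]


# Illustrative data only: the year ranges below are made up, not attested.
SAMPLE = TimeAwareDictionary([
    Entry("fazer", 1200, 1600),  # older spelling of "hacer"
    Entry("hacer", 1500, 9999),
])

print(SAMPLE.misspelled("fazer hacer", 1400))
```

With a design like this, the same word form can be flagged in a modern text yet accepted in a medieval one, which is the behaviour a single static word list cannot provide.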

Similar resources

Spell Checking in Spanish: The Case of Diacritic Accents

This article presents the problem of diacritic restoration (or diacritization) in the context of spell-checking, with the focus on an orthographically rich language such as Spanish. We argue that despite the large volume of work published on the topic of diacritization, currently available spell-checking tools have still not found a proper solution to the problem in those cases where both forms...

Creating and Weighting Hunspell Dictionaries as Finite-State Automata

There are numerous formats for writing spell-checkers for open-source systems and there are many lexical descriptions for natural languages written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spellchecking sugg...

Rule-Based Spanish Morphological Analyzer Built From Spell Checking Lexicon

Preprocessing tools for automated text analysis have become more widely available in major languages, but non-English tools are often still limited in their functionality. When working with Spanish-language text, researchers can easily find tools for tokenization and stemming, but may not have the means to extract more complex word features like verb tense or mood. Yet Spanish is a morphological...

Compiling Apertium morphological dictionaries with HFST and using them in HFST applications

In this paper we aim to improve interoperability and re-usability of the morphological dictionaries of Apertium machine translation system by formulating a generic finite-state compilation formula that is implemented in HFST finite-state system to compile Apertium dictionaries into general purpose finite-state automata. We demonstrate the use of the resulting automaton in FST-based spell-checki...

Using Google to Create a More Accurate and Easily-Extensible Spell Corrector

Spell checkers are now a common, integrated part of many commercial and freely available word processing programs. Agglutinative languages (such as Hungarian and Finnish) pose a separate problem, as there are many different "correct" forms for any given word. Due to the seemingly infinite number of possible words, the limited scope of a dictionary (provided with most spell-checking software) ...

Publication date: 2002